Before we perform the taxonomic binning, we need to compute the sequencing depth of each
contig from the sorted BAM files produced by mapping the metagenomic FASTQ reads to the
de novo assemblies. For this purpose, we will also need the tables produced by the
“get_count_table.py” script above as an input with the sorted BAM file for the
“jgi_summarize_bam_contig_depths” program, which produces a file of five columns: contig
name, contig length, total average depth, mean depth in the BAM file, and the variance of
that depth.
mkdir stats_metabat
jgi_summarize_bam_contig_depths \
    --outputDepth stats_metabat/healthy_depth.txt \
    sam_assemblies/ERR1823587_healthy.bam.sorted
jgi_summarize_bam_contig_depths \
    --outputDepth stats_metabat/moderate_depth.txt \
    sam_assemblies/ERR1823601_moderate.bam.sorted
jgi_summarize_bam_contig_depths \
    --outputDepth stats_metabat/severe_depth.txt \
    sam_assemblies/ERR1823608_severe.bam.sorted
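Before moving on, it can help to check that each depth file looks reasonable. The following is a minimal sketch, not part of the original workflow: the helper name “summarize_depth” is hypothetical, and it assumes that the depth file is tab-separated with one header line and that the first three columns are contig name, contig length, and total average depth. The mean it reports is a simple unweighted mean over contigs.

```shell
# Summarize one depth table written by jgi_summarize_bam_contig_depths.
# Assumption: tab-separated, one header line, columns 1-3 are
# contig name, contig length, and total average depth.
summarize_depth() {
    awk -F '\t' '
        NR > 1 { n++; len += $2; depth += $3 }   # skip the header line
        END {
            if (n == 0) { print "contigs=0"; exit }
            # Unweighted mean of the per-contig total average depth
            printf "contigs=%d total_length=%d mean_depth=%.2f\n", n, len, depth / n
        }
    ' "$1"
}

# Report on each sample whose depth file exists.
for sample in healthy moderate severe; do
    f="stats_metabat/${sample}_depth.txt"
    if [ -f "$f" ]; then
        printf '%s: ' "$sample"
        summarize_depth "$f"
    fi
done
```

A sample that reports “contigs=0” has a depth file with a header but no contigs, which usually points to a problem in the mapping step.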
Then, we can perform binning on the contigs.fasta files produced by the de novo assemblies
above. First, we copy these files into a new directory, “binning”, giving each copy a
sample-specific name.
mkdir binning
cp metag_healthy/contigs.fasta binning/healthy_contigs.fasta
cp metag_moderate/contigs.fasta binning/moderate_contigs.fasta
cp metag_severe/contigs.fasta binning/severe_contigs.fasta
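As a quick check on the copied files, we can count the contigs in each one. This is only a sketch: the helper name “count_contigs” is hypothetical, and it assumes standard FASTA files in which every record starts with a “>” header line.

```shell
# Count FASTA records in a file (header lines start with ">").
# "n + 0" makes awk print 0 instead of an empty string for an empty file.
count_contigs() {
    awk '/^>/ { n++ } END { print n + 0 }' "$1"
}

# Print the contig count of each copied assembly that exists.
for sample in healthy moderate severe; do
    f="binning/${sample}_contigs.fasta"
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$sample" "$(count_contigs "$f")"
    fi
done
```

Each count should match the number of contigs in the corresponding original contigs.fasta file.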
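The next step is to separate the contigs in each contigs file into bins; each bin collects
contigs that are likely to come from the same genome and so ideally represents one species.
The bins of the three samples are saved in separate subdirectories inside the “binning”
directory. The “-o” option gives the path and file name prefix of the bin files, and
“--seed 123” fixes the random seed so that the binning is reproducible.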
mkdir binning/healthy
metabat2 -i binning/healthy_contigs.fasta \
    -a stats_metabat/healthy_depth.txt \
    -o binning/healthy/healthy \
    -t 4 -v --seed 123
mkdir binning/moderate
metabat2 -i binning/moderate_contigs.fasta \
    -a stats_metabat/moderate_depth.txt \
    -o binning/moderate/moderate \
    -v --seed 123
mkdir binning/severe
metabat2 -i binning/severe_contigs.fasta \
    -a stats_metabat/severe_depth.txt \
    -o binning/severe/severe \
    -v --seed 123
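Once binning has finished, a short summary of each output directory shows how many bins were produced and how many contigs each bin holds. The sketch below rests on one assumption: that the bins are written as FASTA files ending in “.fa” (for example, healthy.1.fa) inside the sample subdirectory. The helper name “summarize_bins” is hypothetical.

```shell
# List every bin file in a directory together with its contig count.
# Assumption: bins are FASTA files whose names end in ".fa".
summarize_bins() {
    for bin in "$1"/*.fa; do
        [ -e "$bin" ] || continue   # no .fa files in this directory
        printf '%s\t%s\n' "$(basename "$bin")" \
            "$(awk '/^>/ { n++ } END { print n + 0 }' "$bin")"
    done
}

# Summarize each sample directory that exists.
for sample in healthy moderate severe; do
    if [ -d "binning/$sample" ]; then
        echo "== $sample =="
        summarize_bins "binning/$sample"
    fi
done
```

The files are listed in the shell's glob order, so a bin numbered 10 appears before a bin numbered 2.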